Predictive nomogram for lymph node metastasis and survival in gastric cancer using contrast-enhanced computed tomography-based radiomics: a retrospective study

Background Lymph node involvement significantly impacts the survival of gastric cancer patients and is a crucial factor in determining the appropriate treatment. This study aimed to evaluate the potential of enhanced computed tomography (CT)-based radiomics in predicting lymph node metastasis (LNM) and survival in patients with gastric cancer before surgery. Methods Retrospective analysis of clinical data from 192 patients diagnosed with gastric carcinoma was conducted. The patients were randomly divided into a training cohort (n = 128) and a validation cohort (n = 64). Radiomic features of CT images were extracted using the Pyradiomics software platform, and distinctive features were further selected using a Lasso Cox regression model. Features significantly associated with LNM were identified through univariate and multivariate analyses and combined with radiomic scores to create a nomogram model for predicting lymph node involvement before surgery. The predictive performance of radiomics features, CT-reported lymph node status, and the nomogram model for LNM were compared in the training and validation cohorts by plotting receiver operating characteristic (ROC) curves. High-risk and low-risk groups were identified in both cohorts based on the cut-off value of 0.582 within the radiomics evaluation scheme, and survival rates were compared. Results Seven radiomic features were identified and selected, and patients were stratified into high-risk and low-risk groups using a 0.582 cut-off radiomics score. Univariate and multivariate analyses revealed that radiomics features, diabetes mellitus, Nutrition Risk Screening (NRS) 2002 score, and CT-reported lymph node status were significant predictors of LNM in patients with gastric cancer. 
A predictive nomogram model was developed by combining these predictors with the radiomics score, which accurately predicted LNM in gastric cancer patients before surgery and outperformed other models in terms of accuracy and sensitivity. The AUC values for the training and validation cohorts were 0.82 and 0.722, respectively. The high-risk and low-risk groups in both the training and validation cohorts showed significant differences in survival rates. Conclusion The radiomics nomogram, based on contrast-enhanced computed tomography (CECT), is a promising non-invasive tool for preoperatively predicting LNM in gastric cancer patients and postoperative survival.


BACKGROUND
Digestive cancer continues to be the primary cause of death globally (Shi et al., 2018; Ma et al., 2016; Zhang et al., 2021), with gastric cancer ranking as a prevalent malignancy and the second leading cause of cancer-related mortality worldwide (Bray et al., 2018). The presence of peri-gastric lymph node metastasis (LNM) stands as an independent prognostic factor for gastric cancer (Bando et al., 2002; Deng et al., 2014), emphasizing its critical role in developing standardized and effective treatment approaches for this condition.
Currently, lymph node status in gastric cancer patients is assessed using endoscopic ultrasonography (EUS), computed tomography (CT), and magnetic resonance imaging (MRI). However, these imaging modalities vary considerably in sensitivity and specificity, leading to suboptimal rates of LNM detection. While several molecular diagnostic biomarkers for LNM in gastric cancer patients have been identified (Hiroshi et al., 2009), their practical application is impeded by high costs and technical complexity. Presently, CT serves as the primary imaging tool for preoperative lymph node status evaluation in gastric cancer patients. Nevertheless, with a detection accuracy of only 60% (Kim, Kim & Ha, 2005; Giganti et al., 2017; Park et al., 2012), there is a clear need for a more dependable and precise approach.
Radiomics refers to the extraction of quantitative features from digital medical images using specialized algorithms, with the aim of informing clinical decision-making (Park et al., 2012; Kim et al., 2005). This approach offers a prospective non-invasive method for assessing tumor heterogeneity by integrating numerous imaging features (Gillies, Kinahan & Hricak, 2016; Aerts et al., 2014; O'Connor et al., 2017). Notably, radiomics is increasingly utilized for cancer screening, subtype classification, LNM detection, survival prognosis, and treatment response evaluation in the pursuit of personalized medicine (O'Connor et al., 2017; Coroller et al., 2015; Banerjee et al., 2015; Huang et al., 2016; Li et al., 2016; Jiang et al., 2018; Yoon et al., 2016). While the texture features of CT images have been linked to survival among gastric cancer patients (Giganti, Tang & Baba, 2019; Fu et al., 2015; Badgwell et al., 2016), the predictive features for LNM remain largely unexplored. Few previously reported radiomics models have accurately predicted the presence or absence of LNM in gastric cancer patients based on CT images, and the relevant studies were adapted from MRI-based research methods (Van et al., 2017; Sun et al., 2015).
The aim of this study was to identify distinct contrast-enhanced CT (CECT) imaging characteristics to preoperatively assess LNM in individuals with gastric cancer. Subsequently, a radiomics nomogram was developed by integrating imaging features with clinicopathological traits.

Patients
This study involved 192 gastric cancer (GC) patients, comprising 145 males and 47 females, with a mean age of 65.0 ± 10.5 years. The clinical data were obtained from individuals who underwent gastrectomy at the Department of Gastrointestinal Surgery at The First Affiliated Hospital of Wenzhou Medical University between November 2014 and December 2016. The study adhered to the ethical standards of the Declaration of Helsinki and received approval from the First Affiliated Hospital of Wenzhou Medical University (KY2014-R230). Prior to participation, all patients provided written informed consent. Inclusion criteria encompassed a histopathological diagnosis of GC, assessment of LN status in the postoperative pathological report, and the performance of contrast-enhanced abdominal CT 2 weeks preoperatively. Exclusion criteria comprised patients who had received neoadjuvant chemotherapy or radiotherapy before surgery, lacked high-quality contrast-enhanced abdominal CT images due to artifacts, poor expansion, or imaging manifestations, underwent palliative surgery, or had chronic and heterochronic malignant tumors. Baseline clinicopathological data, including demographic details, clinical indicators, and pathological staging data, were sourced from patients' medical records, covering factors such as gender, preoperative hemoglobin concentration, preoperative serum albumin level, platelet-lymphocyte ratio (PLR), neutrophil-lymphocyte ratio (NLR), presence or absence of medical conditions (e.g., hypertension, obesity, and diabetes mellitus), Charlson Comorbidity Index (CCI), Nutrition Risk Screening 2002 (NRS-2002) score, CT-reported LN status, tumor size, tumor location, tumor differentiation, histopathological type, pathological tumor-node-metastasis (TNM) stage, carcinoembryonic antigen level, and carbohydrate antigen 19-9 level.

Image acquisition, tumor segment isolation, and feature extraction
All participants underwent comprehensive abdominal enhanced CT scanning using a 64-slice spiral CT apparatus (Siemens; Munich, Germany), with a slice thickness ranging from 0.75 to 1.25 mm. The portal phase CT scan images were processed using ITK-SNAP software (version 3.8.0; USA; http://www.itksnap.org/) to semi-automatically delineate the tumor-affected region. Two radiologists collaboratively demarcated the tumor area, and their assessment was subsequently verified by a third radiologist with similar expertise. The delineated region of interest (ROI) is presented in Figs. 1A1-1A3; 1B1-1B2. The original CT image and the demarcated ROI were stored as medical digital imaging files in the Nearly Raw Raster Data (NRRD) format. For automated feature extraction, Pyradiomics, run in a Python environment (version 3.7.2; https://python.org/), was utilized. Detailed descriptions of the tumor feature extraction, parameter calibration, and Z-score standardization processes can be found in the Supplementary Material.
The CT images were processed using ITK-SNAP (version 3.8.0; http://www.itksnap.org/). The delineation of the gastric tumor region was performed by a skilled general surgeon and subsequently assessed and confirmed by a radiologist. An outline of the patient-specific ROI is shown in Fig. 1 (subsection A1-A3; B1-B3). The original CT images and the defined ROI were saved as medical digital imaging files in the NRRD format. Pyradiomics in a Python environment (version 3.7.2; available at https://python.org/) was employed for automated feature extraction. Specifics regarding the parameter settings for feature extraction from the gastric tumor area and the Z-score standardization process can be found in the Supplemental Information.
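The Z-score standardization mentioned above can be illustrated with a minimal sketch. The function name `zscore_columns`, the toy feature values, and the use of the population standard deviation are assumptions made for illustration; the exact procedure used in the study is described only in its Supplemental Information.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance.

    `rows` is a list of patients, each a list of feature values.
    The population standard deviation is used here (an assumption).
    """
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        # A constant column has sigma == 0; map it to 0.0 to avoid dividing by zero.
        [(value - mu) / sigma if sigma > 0 else 0.0
         for value, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]


# Hypothetical values for three patients and two radiomic features.
features = [[10.0, 0.2], [12.0, 0.4], [14.0, 0.6]]
standardized = zscore_columns(features)
```

In a train/validation setting, the means and standard deviations would typically be estimated on the training cohort and then reused for the validation cohort; whether the study did so is not stated in the text.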

Screening of valuable characteristics and establishment of the diagnostic model
The participants were randomly divided into training and validation sets in a 2:1 ratio. Random assignment was used to minimize selection bias and to reduce the risk of systematic differences between the cohorts, thereby supporting the reliability and reproducibility of the results. The outcome variable in the Lasso Cox regression model was the presence or absence of lymph node metastasis (LNM) in gastric cancer patients. Feature selection followed a systematic approach to identify the most relevant imaging characteristics for the predictive model. Within the training set, all 833 extracted features were analyzed via Lasso Cox regression to identify discriminative imaging features, and the selected features were further assessed by logistic regression. Subsequently, the validation set was used to ascertain the accuracy of the established radiomics-based diagnostic model, and cross-validation was applied to address concerns about overfitting arising from the large number of features relative to the sample size.
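A random 2:1 split of this kind can be sketched as follows. The function name `split_cohort`, the integer patient identifiers, and the fixed seed are illustrative assumptions; the study does not report how the randomization was implemented.

```python
import random


def split_cohort(patient_ids, train_fraction=2 / 3, seed=0):
    """Shuffle patient IDs and split them into training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 192 hypothetical patient IDs, giving 128 training and 64 validation patients.
train_ids, validation_ids = split_cohort(range(192))
```

Fixing the seed is a design choice that makes the partition repeatable, which matters when the cohorts are later compared for baseline balance.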
In the cross-validation process, we employed 10-fold cross-validation to assess the robustness and generalizability of the developed radiomics-based nomogram model. Specifically, the dataset was partitioned into k = 10 subsets, and each subset served once as the validation set while the remaining subsets were used for training. By iteratively training the model on k-1 folds and validating on the remaining fold, we obtained an average performance measure, thereby mitigating the impact of overfitting and yielding reliable estimates of the model's predictive capability across different subsets of the data.
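The fold construction described above can be sketched in plain Python. The function name `k_fold_indices`, the seed, and the choice of 128 samples (the size of the training cohort) are assumptions for illustration; the model fitting step is omitted.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Assumed here: cross-validation over the 128 training-cohort patients.
splits = list(k_fold_indices(128, k=10))
```

Each index appears in exactly one validation fold, so averaging a metric over the ten folds uses every sample once for validation.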
To bolster the model's robustness and mitigate potential overfitting, univariate and multivariate analyses were performed on the diagnostic factors following the radiomics diagnosis, and receiver operating characteristic (ROC) curves were generated to compare the distinct diagnostic models.

Statistical methods
The Kolmogorov-Smirnov test was employed to assess the normality of continuous variables. Normally distributed data are presented as means ± standard deviations, whereas non-normally distributed data are presented as medians with interquartile ranges. Logistic regression analysis to assess the radiomic features was conducted using the "glmnet" package of R software. Intergroup differences in continuous variables were assessed using the Wilcoxon rank sum test, while for categorical variables, the chi-squared test or Fisher's exact test was employed.
A visual nomogram model was constructed via multivariate logistic regression analysis using the factors identified in the univariate analyses. The univariate analysis aimed to identify individual factors significantly correlated with the presence or absence of LNM in GC patients, whereas the multivariate analysis determined which of these factors independently influenced LNM when considered together. Variable selection for the nomogram therefore relied on both univariate and multivariate analyses to identify the most significant predictors of LNM in GC patients.
The univariate regression analysis revealed several factors significantly correlated with the presence or absence of LNM, including radiomics evaluation, presence of diabetes mellitus, Nutrition Risk Screening (NRS) 2002 score, preoperative hemoglobin level, platelet-lymphocyte ratio (PLR), neutrophil-lymphocyte ratio (NLR), CT-reported lymph node status, and tumor size (p < 0.05). These findings indicate that LNM prediction depends on multiple factors, including both radiomic features and clinical parameters. Subsequent multivariate regression analysis further refined the variable selection, revealing that radiomics evaluation, NRS-2002 score, CT-reported lymph node status, and diabetes mellitus were significantly correlated with the presence or absence of LNM in GC patients (p < 0.01). These four variables were therefore included in the nomogram model, combining radiomic and clinical attributes for the preoperative prediction of LNM.
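As a rough sketch of how the four independent predictors named above could enter the multivariate logistic model underlying a nomogram, the relationship can be written as below. The symbols p and β0 to β4 and the variable names RadScore, DM, NRS, and CT-LN are illustrative notation assumed here, not taken from the study, and no coefficient values are implied.

```latex
\[
\operatorname{logit}(p) = \ln\frac{p}{1-p}
= \beta_0 + \beta_1\,\mathrm{RadScore} + \beta_2\,\mathrm{DM}
+ \beta_3\,\mathrm{NRS} + \beta_4\,\mathrm{CT\text{-}LN},
\qquad
p = \frac{1}{1 + e^{-\operatorname{logit}(p)}}
\]
```

Here p denotes the predicted probability of LNM. In a nomogram, each term on the right-hand side is rescaled to a points axis so that the total points map back to p.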
The efficacy and accuracy of the finalized model were assessed using the ROC curve. All reported p-values were two-tailed, with statistical significance set at p < 0.05. These analyses were carried out using R software (version 3.6.0; http://www.R-project.org) and IBM SPSS Statistics (version 22.0, IBM Corp., Armonk, NY, USA).

Clinical characteristics
The demographic and clinical characteristics of all patients are presented in Table 1. The mean age of the cohort was 65 ± 10.5 years, and there was no significant difference between the training cohort (65.2 ± 10.4 years) and the validation cohort (64.6 ± 10.7 years) (p = 0.716). Additionally, the two cohorts exhibited similarity in terms of BMI, hemoglobin levels, albumin levels, PLR, NLR, sex distribution, hypertension, diabetes, Charlson Comorbidity Index (CCI), and various pathological and tumor characteristics. These findings indicate that the training and validation cohorts were well-matched, allowing for subsequent analyses and model development.

Construction of the radiomics model
Seven radiomics features were selected in the training cohort using lasso Cox regression based on lambda.min, including original_shape_Maximum2DDiameterSlice, original_shape_SurfaceVolumeRatio, wavelet.LHL_glcm_ClusterShade, wavelet.LHL_ngtdm_Strength, wavelet.LHH_ngtdm_Strength, wavelet.LLL_firstorder_10Percentile, and wavelet.LLL_firstorder_Median (Figs. 2A-2B). The diagnostic performance of the radiomic features for LNM was evaluated by plotting an ROC curve, and the AUC was 0.76, indicating substantial accuracy. Based on the maximum Youden index of the training cohort, the radiomic score of 0.582 was determined as the cut-off, and the patients were stratified into high-risk and low-risk categories (Figs. 3A-3B). Univariate regression analysis showed that the radiomics score, radiomics features, diabetes mellitus, NRS-2002 score, preoperative hemoglobin level, PLR, NLR, CT-reported LN status, and tumor size were significantly correlated with LNM (P < 0.05). Furthermore, the radiomics features, NRS-2002 score, CT-reported LN status, and diabetes mellitus were identified as independent predictors of LNM by multivariate regression analysis (P < 0.01; Table 2), and were combined into a nomogram model to predict LNM within the abdominal cavity of gastric cancer patients (Fig. 4A).
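Choosing a cut-off by the maximum Youden index (J = sensitivity + specificity - 1) can be sketched as follows. The function name `youden_cutoff` and the toy scores and labels are hypothetical; they do not reproduce the study data or its 0.582 cut-off.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is called positive when score >= cutoff; labels are 1 (LNM) or 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical radiomics scores with LNM status (1 = metastasis present).
scores = [0.10, 0.20, 0.35, 0.60, 0.50, 0.70, 0.90]
labels = [0, 0, 0, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
```

With these toy values, patients scoring at or above the returned cut-off would form the high-risk group and the rest the low-risk group.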

Clinical evaluation of the radiomics-based nomogram
The AUC values of the nomogram in the training and validation cohorts were 0.82 and 0.722, respectively, while those for the radiomics features were 0.717 and 0.686, and those for CT-reported lymph node status were 0.663 and 0.65 (Figs. 4B-4C). Therefore, our radiomics nomogram exhibited good discriminatory performance, comparable to that of radiomics features and superior to that of conventional CT scans.
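For readers less familiar with the metric, the AUC equals the probability that a randomly chosen LNM-positive patient receives a higher score than a randomly chosen LNM-negative patient. A minimal sketch of that pairwise definition is given below; the function name `roc_auc` and the toy data are assumptions, and the study itself computed its AUCs in R and SPSS.

```python
def roc_auc(scores, labels):
    """Compute AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs and LNM status for seven patients.
scores = [0.10, 0.20, 0.35, 0.60, 0.50, 0.70, 0.90]
labels = [0, 0, 0, 0, 1, 1, 1]
auc = roc_auc(scores, labels)
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently.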

The radiomics score is associated with the overall survival
Patients in both cohorts were divided into high-risk and low-risk groups according to the radiomics score, and overall survival was compared between the groups using the Kaplan-Meier method.
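The Kaplan-Meier estimate used for each risk group can be sketched as a product of conditional survival probabilities at each observed death time. The function name `kaplan_meier`, the follow-up times, and the event indicators below are hypothetical and serve only to illustrate the estimator.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    `events` holds 1 for an observed death and 0 for a censored observation.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for time in times if time >= t)
        deaths = sum(1 for time, e in zip(times, events) if time == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up in months for one risk group (0 = censored).
times = [6, 12, 12, 20, 30]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Running the same function separately on the high-risk and low-risk groups yields the two curves that would be compared; the statistical test used for that comparison is not specified in this section.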

DISCUSSION
We selected seven radiomic features from the abdominal CECT images of gastric cancer patients, including original_shape_Maximum2DDiameterSlice, original_shape_SurfaceVolumeRatio, wavelet.LHL_glcm_ClusterShade, wavelet.LHL_ngtdm_Strength, wavelet.LHH_ngtdm_Strength, wavelet.LLL_firstorder_10Percentile, and wavelet.LLL_firstorder_Median, for the preoperative prediction of LNM. Previous research has illustrated the association of certain radiomic features with tumor heterogeneity, microenvironment characteristics, and treatment response, all crucial to cancer progression and metastasis. While the specific biological significance of these features was not directly addressed in this study, it is pivotal to elucidate their potential biological and pathological relevance concerning LNM in gastric cancer. CT-based radiomics provides a non-invasive and personalized approach for predicting the risk of LNM in gastric cancer patients, distinguishing itself from other diagnostic methods. Despite the well-established prognostic utility of radiomics features, the generalization of these models across different centers is constrained by variations in CT sources and critical characteristics observed in various studies. Our approach involves the development of a universal diagnostic model for LNM using open-source software, in contrast to the closed diagnostic systems employed by other researchers. This universal model can be replicated by other centers, making it more suitable for wider hospital populations.
An increasing number of treatment guidelines advocate preoperative neoadjuvant chemotherapy for gastric cancer patients, given that, compared with surgery alone, the addition of neoadjuvant chemotherapy confers a survival advantage without increasing postoperative morbidity and mortality (Schwarz, 2015; Newton et al., 2015). LNM serves as a critical determinant of therapeutic interventions for gastric cancer (Dong et al., 2016; Wu et al., 2011), and preoperative neoadjuvant therapy is routinely recommended for patients with LNM, since it has been demonstrated to downstage the lymph node status and increase the likelihood of achieving R0 resection (Schuhmacher et al., 2010; Mocellin, Marchet & Nitti, 2011; Cardoso et al., 2012). Therefore, the construction of predictive models for the preoperative identification and discrimination of LNM is imperative to establish personalized treatment regimens. Although endoscopic ultrasonography (EUS) is beneficial for local staging of GC and accurately defines the T stage, it demonstrates poor reliability in predicting the presence or absence of LNM (Philippe et al., 2012). Likewise, while thin-section CT plays a critical role in preoperative lymph node staging (Liu et al., 2020), the accuracies of both EUS and CT for the preoperative prediction of LNM remain unsatisfactory at 64% and 61-64%, respectively (Wang et al., 2020; Li et al., 2018). A predictive accuracy of 62% was also computed with routine CT scanning.
The rise of high-throughput data has driven a shift toward precision medicine in cancer diagnostics and treatment, and the limitations of traditional imaging techniques have spurred the rapid evolution of alternative approaches such as radiomics. Introduced in 2012, the concept of radiomics has been widely and effectively employed in clinical research (Fukagawa et al., 2018). For instance, Jiang et al. (2018) identified 15 CT image texture features significantly correlated with preoperative LNM in gastric cancer patients, which served as independent predictors of the pN stage. In our study, we focused on seven imaging features from a total of 833, selected based on their significant correlation with LNM at lambda.min. While our resulting model is simpler, it requires further investigation. Additionally, in the MRI domain, Liu et al. analyzed radiomics features and discovered a significant correlation between the whole-lesion apparent diffusion coefficient histogram and LNM, demonstrating a high predictive accuracy of 82.3% (Mocellin, Marchet & Nitti, 2011). Wang et al. (2020) established a CT-based radiomics model for preoperatively predicting LNM in gastric cancer, demonstrating substantial discriminatory power (Cardoso et al., 2012). We employed contrast-enhanced CT (CECT) as the foundation for our radiomics models, given its widespread use for preoperative lymph node status evaluation, along with its greater convenience and reliability compared to MRI. CECT showed superior diagnostic ability for LNM compared to routine CT, with AUC values of 0.844 and 0.837 for the training and validation cohorts, respectively. Similarly, radiomics features exhibited enhanced discriminatory ability for LNM compared to routine CT, achieving an accuracy of 80-84%. This aligns with the findings of Wang et al. (2020), showcasing the potential of radiomics features to enrich image interpretation and complement routine CT scans for evaluating the lymph node status in gastric cancer patients.
In our study, we integrated the radiomics scores with clinically significant parameters related to LNM, culminating in a nomogram model designed for rapid, convenient, and reliable lymph node analysis to guide personalized treatment. Wang et al. (2020) developed a predictive nomogram based on routine CT scans, achieving AUC values of 0.886 in the training cohort and 0.881 in the test cohort, with an 84% accuracy in both (Cardoso et al., 2012). Additionally, Li et al. (2018) developed a nomogram using intra-tumoral iodine concentration and the Borrmann classification to predict LNM preoperatively, achieving AUC values of 0.76 and 0.793, and corresponding accuracy rates of 0.7 and 0.757 in training and validation cohorts, respectively (Cardoso et al., 2012; Philippe et al., 2012). Our CECT-based radiomics nomogram model surpassed CECT evaluation and radiomics features in LNM prediction, attaining AUC values of 0.82 and 0.722 in the training and validation cohorts, respectively. Stratifying patients based on the radiomics scores into high-risk and low-risk groups revealed significant differences in overall survival in both cohorts, highlighting the potential of our novel nomogram model to predict the prognosis of gastric cancer patients (Liu et al., 2020; Wang et al., 2020). Radiomics, with its straightforward yet robust visual analysis and routine imaging tools, enables the extraction of parameters that conventional diagnostic methods might overlook. Additionally, a nomogram based on radiomics features can aid clinicians in identifying suitable candidates for neoadjuvant therapy, particularly considering the impact of LNM.
This study has several limitations that warrant consideration. Firstly, the retrospective design and single-center cohort constrain the generalizability of our model, necessitating validation in multicenter cohorts. Secondly, our findings were not externally validated, highlighting the need for future studies with external cohorts. Furthermore, we only categorized patients based on LNM status without examining the potential influence of specific stages of metastasis (N1 to N3b) or the anatomical sites of metastasis within the 16 designated locations. Lastly, we acknowledge the inherent limitations associated with the small sample size in the current study, which may impact the generalizability and robustness of the findings. As such, future studies involving larger, multicenter cohorts are essential to validate the predictive performance of the radiomics nomogram model and its utility in clinical practice.

CONCLUSION
The nomogram model based on radiomics data could be beneficial for preoperative prediction of LNM and postoperative survival analysis of gastric cancer patients.

Figure 2 Lasso regularization plots for variable selection. The model featuring the most minimal lambda value was chosen (lambda = 0.081, Log (lambda) = −2.507), and seven attributes, identified from the lasso analysis, were integrated into the ensuing logistic regression to formulate the radiomics score. Full-size DOI: 10.7717/peerj.17111/fig-2

Figure 3 Receiver operating characteristic curves of the radiomics model in the training and validation cohort. Full-size DOI: 10.7717/peerj.17111/fig-3

Figure 4 Enhanced CT-based radiomics nomogram and comparative receiver operating characteristic (ROC) curves. (A) Enhanced computed tomography (CT)-based radiomics nomogram for the prediction of lymph node (LN) metastasis in patients with gastric cancer; (B-C) Comparison of ROC curves of three diagnostic variables (radiomics evaluation, CT-reported LN status, and the nomogram) in the training and validation cohort. Full-size DOI: 10.7717/peerj.17111/fig-4